Accepted by IEEE Transactions on Intelligent Vehicles
Jinbao Zhang, Jun Liu, Yu Pei, Jingwei Zhang, Xian Zhao
3D object detection has made great progress recently. However, the trade-off between high accuracy and fast inference remains a crucial issue, and it is particularly evident in voxel-based and pillar-based networks. Voxel-based networks achieve high accuracy, but their 3D sparse convolution backbones hinder real-time inference and model deployment. Pillar-based networks are deployment-friendly and can run in real time, but they cannot match the accuracy of voxel-based networks. To narrow the accuracy gap between these two types of networks while retaining the inference speed of pillar-based networks, in this paper we propose Learn from Voxel Knowledge Distillation (LVKD), an effective voxel-to-pillar knowledge distillation framework. In LVKD, we design Sparse Convolution to Pillar Knowledge Distillation (SCP KD), which selects crucial regions and transfers the rich knowledge of the voxel-based teacher network to the pillar-based student network. In addition, to alleviate the representation differences between the teacher and student networks and improve distillation performance, we propose the Voxel Occupancy Prediction module, a plug-and-play task that encourages the pillar-based network to predict the occupancy of each voxel and thereby reconstruct structural and spatial information. We conduct experiments on two popular public datasets (i.e., nuScenes and KITTI), and the results demonstrate the superiority of the proposed LVKD framework. In particular, our LVKD framework improves the pillar-based network by 3.1% in mean average precision and 2.6% in the nuScenes detection score.
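To make the Voxel Occupancy Prediction idea concrete, the sketch below builds the binary per-voxel occupancy target that a pillar-based student could be trained to predict along the height axis. This is a hedged, NumPy-only illustration: the bin count, the z-range, and the training loss are assumptions for the example, not values taken from the paper.

```python
import numpy as np

def voxel_occupancy_targets(points_z, z_min=-3.0, z_max=1.0, n_bins=10):
    """Binary occupancy target along z for the points of one pillar.

    A voxel (z-bin) is marked occupied if at least one LiDAR point
    falls inside it; the student network would regress these labels,
    e.g. with a binary cross-entropy loss (illustrative choice).
    """
    bins = np.linspace(z_min, z_max, n_bins + 1)   # voxel edges along z
    idx = np.digitize(points_z, bins) - 1          # bin index per point
    idx = idx[(idx >= 0) & (idx < n_bins)]         # drop out-of-range points
    occ = np.zeros(n_bins, dtype=np.float32)
    occ[np.unique(idx)] = 1.0                      # occupied if any point inside
    return occ
```

Because pillars collapse the height axis, such a target restores a supervision signal about the vertical structure that the voxel-based teacher sees natively.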
Figure: Illustration of detection results. The boxes in red and blue are the predicted and ground-truth bounding boxes, respectively. (a), (b) and (c) show the results of the voxel-based teacher detector, the pillar-based student detector without LVKD, and the pillar-based student detector with LVKD, respectively.
Accepted by IEEE Transactions on Intelligent Vehicles
Junning Zhang, Siyuan Huang, Jun Liu, Xiaoxiu Zhu, Feng Xu
Point Cloud Registration (PCR) is an essential part of photogrammetry, remote sensing, and autonomous robot mapping. Existing methods are either sensitive to rotation transformations or rely on feature-learning networks with poor generalization. We propose a novel outdoor point cloud registration algorithm comprising preprocessing, yaw angle estimation, coarse registration, and fine registration (in short, PYRF-PCR). Specifically, the preprocessing effectively eliminates the interference of ground point clouds with PCR. The proposed yaw angle estimator solves large yaw-angle matching via a cross-correlation function that shifts the focus from yaw angle estimation to analysis of the LiDAR horizontal angular resolution. Then, using frequency distribution histograms, we improve the fast point feature histogram (FPFH) algorithm to filter point clouds with a more stable density. For fine registration, an improved iterative closest point (ICP) method based on target centroid distance is proposed, which reduces the running time and the search range between two point clouds. To validate the wide applicability of PYRF-PCR, we conduct experiments on both the open-source KITTI dataset and a local campus scene dataset. On KITTI, the experimental results show that PYRF-PCR achieves state-of-the-art performance compared with existing methods. On the local scene dataset, higher-quality matching across different types of target point clouds reflects the generalization ability of PYRF-PCR.
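The yaw-angle estimation step can be sketched with a circular cross-correlation of azimuth histograms: if the target scan is the source scan rotated by some yaw, the azimuth histogram shifts circularly, and the peak of the cross-correlation recovers the shift. This is an illustrative reconstruction of the general idea only; the bin count and the paper's angular-resolution analysis are assumptions, not its actual formulation.

```python
import numpy as np

def estimate_yaw(src_xy, tgt_xy, n_bins=360):
    """Estimate the yaw between two 2D point sets via circular
    cross-correlation of their azimuth histograms (sketch)."""
    def azimuth_hist(xy):
        ang = np.arctan2(xy[:, 1], xy[:, 0])       # azimuth per point
        h, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi))
        return h.astype(float)
    hs, ht = azimuth_hist(src_xy), azimuth_hist(tgt_xy)
    # circular cross-correlation via FFT; the peak index is the rotation bin
    corr = np.fft.ifft(np.fft.fft(ht) * np.conj(np.fft.fft(hs))).real
    yaw = np.argmax(corr) * 2 * np.pi / n_bins
    return (yaw + np.pi) % (2 * np.pi) - np.pi     # wrap to [-pi, pi)
```

The achievable accuracy of such an estimator is bounded by the histogram bin width, which is why the abstract ties the method to the LiDAR horizontal angular resolution.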
Figure: Visualization of registration results. As observed, our method achieves good alignments on all these scan pairs. More practically, it maintains leading performance in scenes with a large turning radius, a feature that is very important in outdoor environments. We select two regions, A and B, to examine the registration effect closely, which demonstrates the merit of our algorithm in PCR.
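The fine-registration idea of using the target centroid to bound the search can be illustrated with a minimal translation-only ICP step: a coarse move onto the target centroid first shrinks the remaining correspondence search, then a nearest-neighbour step refines the translation. This is a simplified sketch under stated assumptions (translation-only, brute-force matching); the paper's actual centroid-distance criterion and rotation handling are not reproduced here.

```python
import numpy as np

def icp_translation_step(source, target):
    """One translation-only ICP refinement with centroid pre-alignment."""
    # coarse step: move the source centroid onto the target centroid,
    # which bounds the remaining nearest-neighbour search range
    shifted = source + (target.mean(axis=0) - source.mean(axis=0))
    # fine step: nearest-neighbour correspondences (brute force for clarity)
    d = np.linalg.norm(shifted[:, None, :] - target[None, :, :], axis=-1)
    nn = target[d.argmin(axis=1)]
    # least-squares translation update from the correspondences
    return shifted + (nn - shifted).mean(axis=0)
```

In a full pipeline this step would be iterated, with the centroid distance also serving to prune distant correspondence candidates and cut the running time.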